perm filename D[IJ,DBL] blob
sn#131990 filedate 1974-11-21 generic text, type T, neo UTF8
00100 .DEVICE XGP
00200
00300 .FONT 1 "NGR25"
00400 .FONT 2 "NGB25"
00500 .FONT 4 "BDI25"
00600 .FONT 5 "NGB30"
00700 .FONT 6 "FIX20"
00800 .FONT 8 "FIX25"
00900 .TURN ON "↓_π{"
01000 .TURN ON "⊗" FOR "%"
01100 .MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
01200 .MACRO E ⊂ APART END ⊃
01300 .TABBREAK
01400 .COMPACT
01500 .SELECT 1
00100 .PAGE FRAME 44 HIGH 82 WIDE
00200
00300 .AREA TEXT LINES 1 TO 44
00400
00500 .SPACING 70 MILLS
00600 .PREFACE 200 MILLS
00700 .NEXT PAGE
00800 .BEGIN CENTER
00900 ⊗5 BEINGS: REPRESENTATION OF KNOWLEDGE AS INTERACTING MODULES⊗*
01000
01100 Douglas B. Lenat
01200 Stanford Artificial Intelligence Laboratory
01300 Stanford, California
01400
01500 ⊗5↓_ABSTRACT_↓⊗*
01600
01700 .END
01800
01900 .INDENT 6
02000
02100 Knowledge, including that of control, may be organized as a
02200 community of interacting modules (e.g., ACTORS). By granting each
02300 module a complex structure, complex behaviors are easily elicited. By
02400 constraining that this structure be standard over the entire
02500 community, the advantages of a uniform formalism are preserved.
02600 Several task domains are considered, and the most natural is seen to
02700 be automatic programming. An experimental system was partially
02800 implemented for this task. It has synthesized a few large inductive
02900 inference LISP programs heuristically, from specific restricted
03000 dialogues. This research clarified some difficulties inherent to
03100 structured modular systems.
00100
00200 .B
00300
00400 .E
00500 .ONCE CENTER
00600 ⊗5↓_1. INTRODUCTION_↓⊗*
00700
00800 This paper reports on a scheme for representing knowledge as a pool of
00900 structured units, called BEINGs. Each module has several ⊗4parts⊗*, each of
01000 which is either ⊗4primitive⊗* or a pointer to another BEING. The name of
01100 a BEING part represents a question one expert might want to ask another; the
01200 value of the part is that BEING's answer to that question. Thus the parts
01300 correspond to the messages in the ACTOR [Hewitt] formalism. A new constraint is
01400 introduced here, however: every BEING has the same set of parts (with
01500 distinct values, of course).
01600 This uniformity was hoped to permit easy addition of new BEINGs to the pool,
01700 and to facilitate communication among the BEINGs.
01800
01900 A small community of BEINGs was formed, aimed at generating LISP programs
02000 from dialogues with a human user. This experimental system, PUP6, was not
02100 a formal automatic programming program, but rather used the BEINGs to
02200 organize informal knowledge about programming, about the task domain
02300 (inductive inference), and about transfer of control. Three programs have
02400 actually been synthesized by PUP6:
02500 a concept formation program (similar to [Winston]),
02600 a gramatical inference program, and a simple property list maintenance
02700 routine (referred to below as CF, GI, and PL, respectively). Specification
02800 is via rigid dialogue, carried on at great length with the user, in a
02900 small subset of English, in which the task is delineated and
03000 questionable details are discussed. The specification is partial
03100 (ambiguous); the system makes assumptions continually. This process
03200 will be referred to as ⊗4automatic programming.⊗* The code which has
03300 been successfully generated is called the ⊗4target program⊗*.
03400
03500 The main successes of PUP6 were that the desired reasoning
03600 steps in the original human-programmer
03700 protocol were actually simulated, most of the BEINGs
03800 were called on in writing all three programs, and a new "style" of target
03900 program was discovered (they can be interrupted and queried, just as can
04000 the original pool of BEINGs).
04100 The main
04200 defects were its inflexibility to new dialogues, the
04300 inability of an untrained user to add new BEINGs, and the verbosity of the
04400 system.
04500
04600 Our treatment will follow these lines:
04700 First, the ideas of the BEINGs scheme are presented. Examples are then
04800 supplied to illustrate these concepts. The specific
04900 implementation in the domain of automatic programming is
05000 sketched.
05100 Some of its difficulties are shown
05200 to reflect on the BEINGs ideas themselves.
05300
00100
00200 .B
00300
00400 .E
00500 .ONCE CENTER
00600 ⊗5↓_2. General BEING System Ideas_↓
00700
00800 How should knowledge be represented? Many researchers feel
00900 that a simple, uniform formalism should be used for all facts; others
01000 disagree, claiming that complexity of behavior both justifies and
01100 demands complexity of representation.
01200 The benefits of the former include easy
01300 addition of knowledge [Newell] and simple, aesthetic methods for
01400 communicating and combining information [McCarthy].
01500 Structure, however, is necessary
01600 for (⊗4combinatorially⊗*) efficient handling of large amounts of data
01700 (see the real world; also
01800 [MIT]).
01900 A BEING is a collection of several little bits of procedural
02000 code; the answers to questions about the BEING.
02100 That is, a BEING is a small, loosely-knit program, which is
02200 considered ⊗4equivalent⊗* to its answers to these fixed questions. Every
02300 piece of knowledge, including those related to the flow of
02400 control in the system, should be encoded
02500 into BEINGs. There should be nothing else in the system but this
02600 interacting community.
02700 Thus BEINGs represent a compromise of structure and uniformity.
02800 While each BEING is highly structured, this
02900 structure is standard over the entire pool. Each BEING part is itself
03000 a little BEING, etc., and this infinite regress stops when the
03100 contents of a BEING part becomes sufficiently primitive.
03200 The BEINGs operate at about the same level as a
03300 human performing the same task;
03400 BEINGs should consider as primitive just what he considers primitive.
03500 For example, in the automatic programming task (AP), this level
03600 was typically the same as the INTERLISP [Teitelman]
03700 language: a primitive was a single INTERLISP function
03800 call, or a few simple ones connected trivially. Each BEING is
03900 cognizant of the set of thirty questions, in the sense that in
04000 answering one of them it may freely ask questions of other BEINGs
04100 (often through nondeterministic goal statements.) A few of the
04200 BEING-PARTS for AP might be: what is your basic idea and purpose, what
04300 effects do you have, how do you cause them, what must be
04400 ensured before you begin, what is your
04500 chance of success, whom else might you transfer
04600 control to, what alternatives to you exist,
04700 do you evaluate your arguments, what
04800 is the nature of the value you return, why do you
04900 want to be in control now, etc.
05000 The delineation of this set of questions has much to do with
05100 the epistemology of programming.
05200 Each BEING part has two abilities: it may be
05300 ⊗4asked⊗* about something, and it may be ⊗4called⊗* on to do
05400 something. Each of these may involve asking and calling other parts
05500 of itself and of other BEINGs, but typically the second activity
05600 involves an extra level of evaluation.
05700 For example, the ARGS part may be asked simple
05800 questions about the arguments to the BEING, and it is charged with
05900 binding the arguments when the BEING is given control.
06000 The BEINGs control themselves in a simple way. A very general
06100 BEING, SERVE-THE-USER, has control initially. In general, some BEING
06200 ⊗4B⊗* will be in control. Its BEING parts are assembled into an
06300 executable function, and it is run. During the course of its reign,
06400 ⊗4B⊗* will want things done and/or tested which it cannot manage by
06500 itself. This corresponds to when a normal program needs to call a
06600 subroutine. What ⊗4B⊗* does is to call on SATISFY, a BEING which
06700 is the system's general goal mechanism. All it need do is accept a
06800 description of what is needed, then ask the entire pool "who can
06900 do this?" Another general BEING, CHOOSE-FROM, then ranks those which
07000 respond. These two "special" BEINGs seem
07100 to detract from the equality
07200 proclaimed earlier for all BEINGs. But the mechanism for goal
07300 satisfaction and for choosing among vying BEINGs is standard; either
07400 it must be duplicated inside every BEING or else factored out into
07500 some higher-level (meta-, interaction-) BEINGs.
07600 Similarly, the way a BEING's parts fit together
07700 is uniform over all the BEINGs at all times. Thus one simple function
07800 (or assembled BEING; see 3.3) can assemble all the BEINGs initially
07900 into LISP functions.
08000 BEINGs are the only entities permitted (theoretically) to
08100 exist in our system; ergo, for the AP task, all the synthesized code
08200 must be written as BEINGs, and
08300 must be written by BEINGs.
08400 An obvious but crucial consequence is that ⊗4some⊗* BEING(s)
08500 must write new BEINGs. The significant design decision was
08600 that the BEING who knows about
08700 ⊗4X⊗* takes charge of generating BEINGs relating to ⊗4X⊗*. For
08800 example, the INSERT BEING can do inserting by one of a few
08900 algorithms, and he can also write (by specializing himself) more
09000 efficient, special-purpose insert routines,
09100 and he can answer questions about
09200 inserting. This idea is analogous to any reliance upon experts
09300 [Feigenbaum], and also reemphasizes the theme of any modular
09400 representation of knowledge.
09500 A second consequence is that the output code will have all
09600 the ⊗4types⊗* of "intelligence" that any BEING community has: it
09700 can write variations of itself, modify itself according to the user's
09800 complaints, and answer questions about what it is doing as it runs.
09900 In a similar vein, ⊗4some⊗* BEING(s) must do the translation
10000 of the users' quasi-English inputs into BEING-usable form. ↓_Each_↓
10100 BEING ⊗4X⊗* is charged with recognizing English related to ⊗4X⊗*.
10200 Thus translation
10300 consists of querying "who can recognize ..." and waiting for a
10400 response. For example, our INSERT BEING must also recognize and
10500 process phrases involving inserting, stack-pushing, and merging.
10600 A result of this treatment of natural language
10700 processing is that any phrase which can be translated can be acted
10800 upon.
10900 Why is automatic programming (AP) a good task for a
11000 BEINGs pool?
11100 The BEINGs representation may be suitable for simulating any
11200 complex task ⊗4T⊗* involving frequent small interventions by various
11300 experts. In addition to writing programs, this activity could
11400 perhaps be as
11500 diverse as playing chess, coordinating sensors/effectors,
11600 simulating physical
11700 interactions, doing research in mathematics, or playing volleyball.
11800 Some of these will be elaborated in section 3.
11900 There ⊗4is⊗* one particular bias of BEINGs toward writing
12000 high-level programs, over other activities. The new BEINGs created
12100 during the execution of a specified task are kept distinct from the
12200 original set of BEINGs. Then a ⊗4program⊗* for task ⊗4T⊗* is
12300 accomplished by doing ⊗4T⊗* and then dumping the new BEINGs out onto
12400 a new file. The entire old BEINGs pool is then treated as the
12500 language supporting this file. (As a corollary, asking a BEINGs-based
12600 system to write any subset of itself is the null task.) Henceforth, the
12700 task for the BEINGs pool will be AP
12800 unless explicitly stated otherwise.
12900 A few of the "new ideas" may be disguised new biasses.
13000 These include: One need not feel any reverence toward the
13100 exploratory anthropomorphic paradigm (common to programming, doing math,
13200 etc.) of ignoring details and catching them
13300 later by debugging. As in all complex behavior,
13400 decisions continually crop up which
13500 no BEING is able to resolve at the time. The BEINGs system
13600 should always spend a significant effort to defer the decision as
13700 long as possible. When, at last, no progress can be made without its
13800 resolution, and if the system is still unsure, then either the user
13900 settles the question or else a backtrack point is reluctantly set up.
14000 Hopefully, by this time, some new information is present which
14100 enables the system to resolve the decision, thus reducing the amount
14200 of backtracking. If there are two or more decisions which can no
14300 longer be deferred, the system tackles first the one estimated to be
14400 the easiest (analogous to Occam's razor).
14500 Another prejudice is that most of the carelessness
14600 "bugs" can be eliminated through this deferral, feed-forward, and very
14700 precise record-keeping. Humans depend on their adaptability to
14800 compensate for limitations in their brain hardware
14900 [Newell], but there is no
15000 need for an ⊗4automatic⊗* programming system to do so. For example,
15100 when a list structure is first encountered, the system records
15200 warnings that it is undefined, unaccessed,
15300 and so on. Each such worry is weighted and deferred, but not
15400 forgotten.
15500
15600 Most programmers intentionally augment their code to aid in
15700 later debugging, modification, and readability by humans. These aids
15800 are typically comments, summaries, over-generalized subroutines,
15900 optional print statements, and runtime statistics. Recently, the
16000 structure of the program itself has also been
16100 recognized as a powerful
16200 manipulable element, affecting the accessability of the code, not just
16300 its length or running time.
16400 Any program written as a pool of
16500 BEINGs is several times as long as a clean hand-coded version. This
16600 extra code, the parts of each new BEING which are superfluous, may be
16700 viewed as well-organized self-documentation. They should improve the
16800 debugging, expansion, and accessibility (to alien users) of the
16900 synthesized code.
17000 Some assertions are so meaningful to the user,
17100 that they should be stored in the code itself, even if they are
17200 only temporary. At any time, the user
17300 may look at a piece of code; the comments present ⊗4then⊗* are all
17400 assertions pertaining to that piece of code at that time.
17500 BEINGS may use comments at several different levels. At the
17600 lowest level, each comment is merely a unique token; it may be
17700 inserted, removed, or searched for. Slightly better, the BEINGs
17800 system can ask "is there a comment involving ...". For this purpose, a
17900 constrained syntax for the comments should be developed. Otherwise
18000 (as, unfortunately in PUP6) each new comment must be attended to
18100 ad hoc. Still higher level
18200 usage would be looking at a comment totally
18300 ignorant of it, and ⊗4translated⊗* it into something meaningful.
18400 By studying the difficulties of the implementations of the BEINGs
18500 ideas, isolating those due to poor programming from those due to
18600 poor ideas, enough may be learned to build the next system one
18700 qualitative step closer to the ideal.
18800 It is in this spirit that BEINGs
18900 are now contrasted against the concepts of ACTORs, CLASSes,
19000 FRAMES, HACKER, formal AP systems, and macro expansion schemes.
19100 ACTORS [Hewitt] provided the key concept of integrating
19200 uniformity of construction with sophistocation of behavior. There
19300 is a continuum, among modular knowledge schemes, of standardization of
19400 "message" types between modules. ACTORs have no restriction whatsoever
19500 on this format. Each module has its own, unique parts (types of
19600 answers). So each ACTOR must be aware of all the parts (message formats)
19700 of all the ACTORs it ever is going to communicate with.
19800 Adding a new module is thus conceptually intricate as
19900 well as practically difficult. CLASSes [..........] have a few standard
20000 parts, and the modules are arranged in groups, each of which has its
20100 own additional types of parts. Thus a module can ask ⊗4any⊗* other module
20200 one of a few universal questions, and it can ask any module in its group
20300 an additional set of questions. If it is permitted to know about other
20400 groups' special parts, then the same adding problem recurs. If it is
20500 denied such knowledge, it cannot access much of the knowledge in the
20600 pool. If one requires a completely universal set of message types, then
20700 most of them may be irrelevant to most modules. This is the price which
20800 BEINGs pay. Later, it will be shown that this superfluity is real and is
20900 marginally tolerable. The most devastating criticism of striving for a
21000 universal set of module questions is that all this does is push all the
21100 non-uniformity down into the ⊗4values⊗* of these parts. The only retort is
21200 empirical: if a good partitioning of the questions can be found, then
21300 the internal structure of each part with the same name will be comparable,
21400 and the only communication necessary will be simple accessing of
21500 modules's parts (direct questioning).
21600
21700 FRAMES [Minsky] seems superficially similar to BEINGs, and
21800 are so amorphous that they actually could subsume BEINGs. There is a
21900 deep difference of philosophy and of intended usage, however.
22000 FRAMES intentionally have default values already (with no processing)
22100 filled in for each frame, and replaced only when necessary.
22200 This is akin to automatic programming by blind debugging, but is
22300 antithetical to the fastidious bookkeeping BEINGs philosophy. As an
22400 example, consider the writing of a short, recursive
22500 LISP program (reverse, flatten, factorial, alternate, etc.)
22600 A human, consulting the proper frame in his head, knows to ignore the
22700 base step for a while, then fill it in, usually as ⊗4if NIL, then
22800 NIL⊗*. The
22900 human makes this default synthesis, tries out the program, and looks
23000 closer (to fill in this "slot" properly) only if something calls his
23100 attention to it (such as a LISP error message:
23200 NON-NUMERIC ARG ⊗4NIL⊗*, e.g., if what was really needed was the base
23300 step ⊗4if NIL, then 0⊗*).
23400 A BEINGs system would
23500 also defer working on the base step, but could -- and would -- place
23600 a note about that deferral into the assertional warning base. Before
23700 thinking it was finished, the system would attend to that note
23800 carefully. This is not a criticism of FRAMES:
23900 they are meant to model
24000 perception, aBEINGs are not.
24100 HACKER [Sussman] differs from BEINGs in the same qualitative
24200 way as FRAMES do.
24300 The orientation of HACKER is to put together pieces of programs
24400 into something close to the final desired target, then try and run
24500 it. When errors result, they are analyzed and then patched. BEINGs
24600 is oriented toward writing bug-free code; what it writes must always
24700 coincide with its internal specifications of that code. The user may
24800 later change his mind, and a BEINGs system must be able to modify its
24900 own programs, but this maintenance
25000 process is much better defined than blind
25100 debugging. On the other hand, if a BEINGs system does have an
25200 unexpected bug in it, it may not be able to recover from it. One
25300 way to overcome this would be to include special recover-from-bugs
25400 BEINGs.
25500 The formal manipulation systems which also synthesize code
25600 are so different from BEINGs, that it is difficult to compare them.
25700 These logical systems (e.g., [Luckham]) attack an entirely different
25800 problem. Their task is specified rigorously in advance, and they
25900 proceed formally to search for a solution program. BEINGs are
26000 designed to work on a much higher level, heuristically. Each action
26100 is implicitly justified by the fact that the user can later describe
26200 to the system
26300 what is undesirable about the program it's generated. BEINGs should
26400 thus be tolerant of user ambiguity and error.
26500 Like ⊗4any⊗* system which writes structured programs, the
26600 external behavior of a BEINGs system applied to AP is
26700 reminiscent of macro expansion regardless of ⊗4what⊗* the internal
26800 BEING interactions are. One major distinction between the two
26900 concepts is when and how the macros are expanded. Traditional answers
27000 are: at compile time, by rigid substitutions.
27100 BEINGs systems expand their "macros" at
27200 times they choose, using knowledge gleaned from each other and from
27300 the user. For example, consider a series of macro calls occurring in
27400 the code: (m1)(m2)(m3). A macro expander could expand these in any order,
27500 and the three activities could go on simulatneously, with no interference
27600 or change in the final code. BEINGs would try to find some task common
27700 to all three calls, and work on that first. The order of which to
27800 first expand fully would be carefully weighed,
27900 and during that expansion new
28000 information would be learned which could affect the expansions of the
28100 other two function calls. The final code would not simply be the APPEND
28200 of the expansions of m1, m2, m3. As macro expansion schemes increase
28300 in sophistocation, it may someday be appropriate to refer to BEINGs as
28400 such a system.
00100
00200 .B
00300
00400 .E
00500 .ONCE CENTER
00600 ⊗5↓_3. Specific Examples_↓
00700
00800 .B
00900
01000 .E
01100 .ONCE FLUSH LEFT
01200 ↓_3.1. The set of BEING parts used in the automatic programming task_↓
01300
01400 Below are listed the actual set of BEING parts which were used in the
01500 PUP6 system. The overall task for the BEINGs is to write programs --
01600 i.e., more BEINGs. Next to each part is the percentage of BEINGS in the
01700 system which actually needed to have this part specified, and a brief
01800 description of what the part does.
01900
02000 .BEGIN NOFILL
02100 .FLUSH LEFT
02200
02300 ⊗2IDEN⊗* 54% How is this BEING referenced in English phrases? Implemented as productions.
02400 ⊗2ARGS⊗* 63% How many arguments? Which are optional? Are there any local variables?
02500 ⊗2ARG-CHECK⊗* 81% Predicates which examine each argument for suitability.
02600 ⊗2EVAL-ARGS⊗* 4% Which arguments are to be evaluated, and which quoted?
02700 ⊗2WHAT⊗* 82% A brief summary of the global purpose. Usually a template for an English sentence.
02800 ⊗2WHY⊗* 77% A justification of the BEING's existence. The caller explains here why it was called.
02900 ⊗2HOW⊗* 72% A summary of the methods the BEING intends to employ. Global strategies.
03000 ⊗2EFFECTS⊗* 27% What will probably be true after this BEING is through. What can it achieve?
03100 ⊗2WHEN⊗* 19% Why should this BEING be given control now? Computed as the sum of weighted factors.
03200 ⊗2META-CODE⊗* 70% The body of the executable code, but with uninstantiated subparts.
03300 ⊗2COMMENTS⊗* 16% Hints for filling in undefined sections of other BEING parts.
03400 ⊗2REQUISITES⊗* 10% If this BEING is actually chosen, what must be made true just before
03500 (pre-), during (co-), and after (post-) execution? Work to make these true.
03600 ⊗2DEMONS⊗* 7% Which demons should be kept active while the BEING is in control?
03700 ⊗2AFFECTS⊗* 14% Which other BEINGs might be called by this BEING, and why?
03800 ⊗2COMPLEXITY⊗* 92% A vector of utility measures, including ease of calling, chance of success,
03900 chance of calling* itself, expected time cost, efficiency of its output results.
04000 ⊗2SPECIALIZATIONS⊗* 40% How to write a more streamlined, special-case version of this BEING.
04100 ⊗2GENERALIZATIONS⊗* 27% What are some other BEINGs, encompassing this one?
04200 ⊗2ALTERNATIVES⊗* 16% What are some equivalent BEINGs? (to try if this one fails)
04300 ⊗2RESULTS⊗* 15% How many values does this return? What domain is each in? Any side effects?
04400 ⊗2STRUCTURE⊗* 4% If this is ever viewed as a data structure, what operations are done to it, and how?
04500 ⊗2ENCODABLE⊗* 9% How to order the evaluation of the other parts, when writing a new version.
04600 ⊗2INHIBIT-DEMONS⊗* 5% A lock/unlock mechanism, useful when handling demonic interrupts.
04700 ⊗2FORM-CHANGE⊗* 1% A last-minute addition: how to alter the whole shape of a new BEING's parts.
04800 .SELECT 1
04900
05000 ↓_3.2. A high-level domain-independent BEING_↓
05100
05200 .END
05300 OBTAIN-USABLE-INFORMATION is a typical high-level,
05400 domain-independent BEING. The WHEN part consists of a set of
05500 "⊗5if⊗* <predicate> ⊗5then⊗* <value> ⊗5because⊗* <reason>" expressions.
05600 If the predicate evals to non-null, then the value program is executed.
05700 It returns a number, which is added in as a factor to the final WHEN
05800 value. The reason evaluates to a quasi-English phrase for inquisitive
05900 human users.
06000 The final sum reflects how good this BEING would be to pass control to.
06100 This "linear" scheme is undesirable, rough, but adequate.
06200 This particular BEING
06300 has factors which respond that it is ⊗4always⊗* an undesirable
06400 thing to do, but ⊗4may⊗* be indicated if there exists
06500 new but not yet usable information.
06600 These factors, and their weights (see below), like all the parts
06700 of all the BEINGs initally in PUP6, were decided upon and inserted by
06800 hand.
06900 The IDEN parts are collected together into a large
07000 translation table. When a form ⊗4X⊗* must be translated, the
07100 TRANSLATE BEING goes through this table, asking each IDEN part if it
07200 claims to recognize ⊗4X⊗*. Each IDEN runs its own little program,
07300 typically some type of pattern match involving ⊗4X⊗* and the state
07400 of the world, to answer this question.
07500 If there is more than one responder,
07600 CHOOSE-FROM picks the one with the highest priority of match. The
07700 winner runs a little program which ultimately returns a BEING-call or
07800 a constant as the translated value of ⊗4X⊗*. This program might call
07900 other BEINGs, often calls TRANSLATE several times recursively on
08000 parts of ⊗4X⊗*. Consider the IDEN part of the
08100 OBTAIN-USABLE-INFORMATION BEING (below).
08200 There are just two classes of phrases that this BEING can recognize.
08300 For example, the second usage is
08400 if some BEING or user wants to find out more about anything, then
08500 OBTAIN-USABLE-INFORMATION can be called with that thing as argument.
08600 The EFFECTS parts of each BEING are similarly collected into
08700 one table to facilitate asking all BEINGs simultaneously "Can you
08800 cause effect X to occur?" Each EFFECTS part examines X and the world, and
08900 either replies No, or else returns a little program (involving calls
09000 and constants) which should produce the effect, with a certain
09100 probability. This program generally will involve a call to the BEING
09200 itself. OBTAIN-USABLE-INFORMATION shows that it should be called to acheive the
09300 existence of new, usable information (see the MAIN:EFFECTS part, below).
09400 The WHAT, HOW, and WHY parts are mainly for the user's
09500 benefit.
09600 When a choice between BEINGs must be made, the WHEN,
09700 AFFECTS, and COMPLEXITY-VECTOR parts are queried.
09800 If a new,
09900 manipulated version of the BEING must be created, the
10000 SPECIALIZATIONS, ENCODABLE, DATA-STRUCTURE, PREDICATE, and
10100 FORM-CHANGING parts might be relevant.
10200 If the BEING fails, some
10300 BEING specified in the ALTERNATIVES or GENERALIZATIONS parts might
10400 be tried.
10500 In the current case, the COMPLEXITY-VECTOR says that it is of
10600 average difficulty to call, its descendants may (.5 chance) call it
10700 back, it has an average chance of success and cost of attempting it,
10800 and there is no (.1) good reason to defer it any longer.
10900 The AFFECTS part of the OBTAIN-USABLE-INFORMATION BEING is
11000 clear. CHOOSE-FROM is definitely called, and four others might be.
11100 The absence of some parts, like DATA-STRUCTURE, PREDICATE,
11200 and NLAMBDA, indicate that these qualities don't apply. The absence
11300 of other parts, like SPECIALIZATIONS and ALTERNATIVES, indicate
11400 only a
11500 lack of thoroughness in filling out unneeded BEING parts.
11600 The META-CODE says to choose from the following four
11700 alternatives, each of which is itself a BEING:
11800 Get-Brand-New-Information (in English, from the user), Translate
11900 something (transform from English to BEING-calls),
12000 Analyze-The-Implications (of some existing, translated information),
12100 Extract-a-Relevant-Subset (of the existing information) to
12200 concentrate upon.
12300 SATISFY is employed only when the BEING doesn't know exactly
12400 which BEING to call. If the ⊗4set⊗* of possible choices is known, as
12500 in this case, then CHOOSE-FROM is called with the choice set explicit.
12600 Below are exhibited the actual representation of this BEING
12700 in PUP6, and the function which would be executed if it were
12800 ⊗4called⊗*.
12900
13000 .BEGIN NOFILL
13100 .SELECT 6
13200
13300
13400 ⊗4↓_PART_↓ ↓_actual value of the part for OBTAIN-USABLE-INFORMATION BEING_↓⊗*
13500
13600 IDEN ((if you see: (OBTAIN-USABLE-INFORMATION any1)
13700 then return: (OBTAIN-USABLE-INFORMATION (TRANSLATE any1)))
13800 (if you see: (FIND OUT MORE ABOUT any1)
13900 then return: (OBTAIN-USABLE-INFORMATION any1)))
14000 EXPLICIT-ARGS (U)
14100 WHAT ( OBTAIN SOME INFORMATION WHICH CAN BE USED)
14200 HOW ( OBTAIN NEW FACTS ABOUT OLD INFORMATION, OR OBTAIN TOTALLY NEW INFORMATION)
14300 NEW INFORMATION)
14400 WHY ( PUP HAS NO MORE INFORMATION THAT IT CAN USE TO PROGRESS)
14500 WHEN ((if T then add in -10 because (THIS IS AN EXPONENTIALLY-GROWING, BAD THING TO DO IN GENERAL))
14600 (if NEW-PLO-LIST then add in 111
14700 because (WE SHOULD WORK ON UNASSIMILATED NEW INFORMATION IF THERE IS ANY)))
14800 META-CODE (DO
14900 (CHOOSE-FROM ((TRANSLATE U)
15000 (GET-NEW-INFORMATION U)
15100 (ANALYZE-IMPLICATIONS U)
15200 (EXTRACT-RELEVANT-SUBSET U)))
15300 BECAUSE
15400 (WE CAN ONLY TRY TO OBTAIN USABLE INFORMATION IN ONE WAY AT A TIME))
15500 MAIN-EFFECTS ((to get (NEW PLO any1) do (OBTAIN-USABLE-INFORMATION any1))
15600 (to get (USABLE PLO any1) do (OBTAIN-USABLE-INFORMATION any1)))
15700 AFFECTS ( (CHOOSE-FROM IS CALLED)
15800 (TRANSLATE POSSIBLY IS CALLED)
15900 (GET-NEW-INFORMATION POSSIBLY IS CALLED)
16000 (ANALYZE-IMPLICATIONS POSSIBLY IS CALLED)
16100 (EXTRACT-RELEVANT-SUBSET POSSIBLY IS CALLED) )
16200 COMPLEXITY-VECTOR (.5 .5 .5 .5 .1)
16300
16400 .END
16500
16600 .ONCE FLUSH LEFT
16700 ↓_3.3. What happens when a BEING is called_↓
16800
16900 When a BEING is ⊗4called⊗*, its parts are assembled into a
17000 function which is then executed. Here is the ⊗4FUNCTIONAL⊗* form of the
17100 OBTAIN-USABLE-INFORMATION BEING:
17200
17300 .BEGIN NOFILL
17400 .SELECT 6
17500
17600 (OBTAIN-USABLE-INFORMATION
17700 (LAMBDA (U FN-VALUE FINAL-CO-REQ)
17800 (PROG1
17900 (AND
18000 (SETQ BEING-STACK (CONS OBTAIN-USABLE-INFORMATION BEING-STACK))
18100 (PUT OBTAIN-USABLE-INFORMATION SPEC-WHY BECAUSE)
18200 (EVERY (APPEND CURRENT-DEMONS USER-INTERRUPT-DEMONS)
18300 (QUOTE APPLY*))
18400 (SETQ BECAUSE
18500 (QUOTE (WE CAN ONLY TRY TO OBTAIN USABLE
18600 INFORMATION IN ONE WAY AT A TIME)))
18700 (CHOOSE-FROM (QUOTE ((TRANSLATE U)
18800 (GET-NEW-INFORMATION U)
18900 (ANALYZE-IMPLICATIONS U)
19000 (EXTRACT-RELEVANT-SUBSET U)))))
19100 (SETQ BEING-STACK (CDR BEING-STACK)))))
19200 .END
19300
19400 The process of assembling the BEING parts into a function is
19500 straight-forward. First, the explicit arguments (those global to the
19600 BEING) are bound. The implicit arguments (those local to the BEING,
19700 like PROG variables) are initialized. The name of the BEING is pushed
19800 onto the BEING control stack (pointing to its caller), and any
19900 newly-activated demons are pushed onto the demon stack. The
20000 ARGS-CHECK predicates are evaluated. Then PUP6 works to make each
20100 PRE-REQUISITE true in the world. Each COMMENT is evaluated, then the
20200 CO-REQUISITES, META-CODE, and current demons all are executed in
20300 pseudo-parallel. Each POST-REQUISITE is then satisfied. Since these
20400 activities are all embedded in an AND, any of them may cause the
20500 entire BEING to halt and fail, simply by returning NIL.
20600 In both cases, just before exiting, the demon
20700 stack is popped and the BEING stack is updated (usually just popped),
20800 and control passes to the delegated BEING (usually the one who called
20900 this BEING, at the state when he called it.) A fancy context
21000 mechanism was available but not actually needed.
21100 The function which assembled all the BEINGs exploited the
21200 fact that it operated only at system load time. Thus if the BEING
21300 had no new DEMONs to activate, all the corresponding demon-stack
21400 manipulations could be omitted. Optimizations like these are clear
21500 from comparing the functional form of OBTAIN-USABLE-INFORMATION
21600 and the description of how it theoretically
21700 should be created (see above).
21800 Another example in this BEING is that no CO-REQUISITES
21900 occur, so no parallel processing need be simulated.
22000
22100
22200 .B
22300
22400 .E
22500 .ONCE FLUSH LEFT
22600 ↓_3.4. A high-level, domain-specific BEING_↓
22700
22800 PARTITION-A-DOMAIN is a high-level BEING, specific to the domain of
22900 writing inductive inference programs,
23000 whose basic idea is to divide up a space into categories. Only two
23100 BEING parts are filled in here which were absent from
23200 OBTAIN-USABLE-INFORMATION; namely, SPECIALIZATIONS and DEMONS. The
23300 SPECIALIZATIONS part says that to write a specific version of itself,
23400 a few decisions must be made. The results will then indicate what
23500 pieces of code get assembled together to form the new BEING. The
23600 partition may be only partial (an element of the domain belongs to no
23700 class of the partition), only weak (an element belongs to more than
23800 one class), and, most importantly, the specialized BEING should work
23900 by repeatedly doing some of the following three activities: accept a
24000 class-name from the user and guess some of its elements, accept an
24100 element from the user and try to guess which class it belongs to,
24200 or accept an <element, class-name> pair. Other BEINGs know
24300 about these accepting, guessing, and checking activities.
24400 The DEMONS part says that during the partitioning, the only
24500 new demon to keep active is the one called Fringe-of-Consciousness.
24600 In the course of
24700 writing GI, the system learns that the main task is one of
24800 grammatical inference. The Grammatical-Inference BEING then asserts
24900 that if the user ever mentions the word TEST, he may actually mean
25000 PARSE. This is placed in the IDEN part of the TEST BEING, and is
25100 therefore only demonized, waiting on the periphery of PUP6's
25200 concentration. It is in fact triggered later in the dialogue, as an
25300 inference surprising to the user.
25400
25500 .NOFILL
25600
25700 ↓_3.5. The front and rear parts of the dialogue which results in CF_↓
25800 .FILL
25900
26000 The dialogue to synthesize CF begins by PUP6 asking the
26100 user for any task. The user replies, ⊗4Write a program which does
26200 concept formation⊗*. There are many decisions that PUP6 knows about in
26300 writing a specialized concept formation program [Hempel],
26400 and it manages to
26500 defer them all. For example, it must choose between classificatory,
26600 comparative, and metrical concept formation. Since all three involve
26700 partitioning a domain of examples, PUP6 decides it can defer this
26800 choice until after code for the partitioning is synthesized. The
26900 effort of the system is now directed toward this "piece" of the
27000 program. When it is completed, an hour or two later, the system asks
27100 the user to make this decision. When he picks the first alternative,
27200 which involves ⊗4only⊗* partitioning, the system prints that it has
27300 finished the entire CF task, and dumps the newly created BEINGs onto
27400 a file.
27500
27600 .NOFILL
27700
27800 ↓_3.6. Admissability of popular AI language features_↓
27900 .FILL
28000
28100 The claim is now made that the presence of popular
28200 AI language features did not detract from the combinatorial behavior
28300 of the system, since BEINGs subsume each mechanaism used.
28400 Direct call by name may be viewed as a trivial type
28500 of pattern-directed call,
28600 where each BEING can recognize its own name. See the IDEN part of the
28700 OBTAIN-USABLE-INFORMATION BEING, for example.
28800 A demon
28900 in PUP6 is merely a function which gets executed periodically,
29000 and may occasionally trigger a BEING. This could be replaced by a
29100 BEING whose EXPLICIT-ARGS-CHECK part contains the triggering
29200 predicate, and whose META-CODE is simply a call by name directly to
29300 the BEING which is supposed to be activated.
29400 An assertion can be
29500 viewed as a BEING containing only an IDEN part; when the BEING
29600 recognizes a form which matches it, it returns the fully instantiated
29700 assertion.
29800 A function is equivalent to a BEING with only META-CODE,
29900 ARGS, and NLAMBDA
30000 parts; one knows almost nothing about it before executing it.
30100 Notice that
30200 the overheads saved by these mechanisms are substantial. For example,
30300 assertions replace an entire BEING call by a single CDR evaluation.
30400
30500 .NOFILL
30600
30700 ↓_3.7. CF: the concept formation target program_↓
30800 .FILL
30900
31000 The reader may have observed the static quality of these
31100 examples, and wished for one of BEINGs in action, passing control
31200 back and forth in order to do something. But to present a reasonable
31300 example of BEINGs interacting, it is necessary to understand
31400 something about the target task. Thus we describe CF, and then
31500 present further excerpts from the dialogue which results in its synthesis.
31600
31700 CF should operate by repeatedly scanning a scene and trying to name it. The
31800 scene is broken into a set of features and a set of objects. Each
31900 feature is a relation
32000 on one or more objects in the scene. Internally, the program
32100 maintains a model for each differently-named
32200 structure it has ever encountered.
32300 The model contains a description of the objects expected in the
32400 scene, a set of features which must be present in the scene (else it
32500 can't be the same as this concept), a set of features which must not
32600 be present in the scene, and a set of features which may or may not
32700 be present. Thus a model is like an archtypical scene plus a name.
32800 Each time the program is confronted by a new scene, the program must
32900 scan its models until it finds one which matches it. A model is said
33000 to match a scene if all the MUST features associated with that model
33100 are present in the scene, and all the MUSTNOT features are
33200 absent from the scene. CF
33300 informs the user of this guess,
33400 and accepts the proper concept name. If it guessed incorrectly, it
33500 modifies its models. The wrong guess model may have features added to
33600 its MUST or MUSTNOT sets; the correct concept name model may have to
33700 be modified or (if it's a new concept) created and inserted into the
33800 list of models. For more details, see [Winston].
33900
34000 .NOFILL
34100
34200 ↓_3.8. A dynamic example_↓
34300 .FILL
34400
34500 An example of the PUP6 community of BEINGs interacting will now
34600 be presented. Let's jump one third of the way into the dialogue which
34700 writes CF. The system learns it must compare the input scene against
34800 its internally stored models of concepts, until it finds one which
34900 doesn't fail the comparison. It asks the user what it means for
35000 scene S to fail the comparison to model M. The user replies,
35100 whenever M is incompatible with the observed features of S.
35200 The IDEN part of the
35300 CONTRADICTS BEING recognizes this sentence pattern, and transforms it
35400 to
35500 .NOFILL
35600 (FORSOME F IN M-FEATURES (specialized:contradicts F S-FEATURES)).
35700 .FILL
35800 The BEING
35900 which inserts this into the synthesized code requires that the user
36000 pick some proper name for the function specialized:contradicts;
36100 this will be a streamlined contradiction test written by the
36200 CONTRADICTS BEING. Say the user reponds by calling it IMPOSS. This naming
36300 and specializing is central to BEING creation: a BEING recognizes an
36400 instance of itself, and decides either to insert a call to itself or
36500 else to insert a call to a specialized version of itself. If any
36600 nontrivial decisions must be made, and can be settled at synthesis
36700 time, then the latter alternative is chosen. CONTRADICTS is too
36800 general a BEING to be called in an inner loop, so its content will
36900 be hammered out at synthesis time. On the other hand,
37000 FORSOME is so primitive that no specialized
37100 version of it is written normally.
37200 Here is the way this piece of the dialogue actually appears.
37300 The user's reponses are italicized.
37400 .BEGIN NOFILL FLUSH LEFT
37500
37600 PUP: Please type in a logical expression which is true iff we terminate the loop
37800 USER: ⊗4(Any feature in model-features is incompatible with scene-features)⊗*
37900 PUP: PUP wants USER to type in name for specialized version of CONTRADICTS
38000 USER: ⊗4IMPOSS⊗*
38100 .END
38200 Later in the dialogue, PUP6 decides it must
38300 expand the function IMPOSS. The CONTRADICTS BEING is again called on; this
38400 time it is asked how to write a specialized version of a
38500 contradiction test. It replies that SOME-OF the following types of
38600 contradiction may occur: PROBABILITY:0, PROBABILITY:1, and
38700 PROBABILITY:ε(0,1). This is the germ of the idea for the
38800 MUSTNOT/MUST/MAY categorization of features. The SOME-OF BEING takes
38900 control, and asks if the decision can be deferred. The DEFERRAL
39000 BEING looks about, first asking if there is any non-null piece of
39100 code that all three have in common. This fails, and eventually the
39200 DEFERRAL BEING reports failure. SOME-OF sees it has no basis upon
39300 which to form a guess, and wants somebody to ask the user. The
39400 ASK-USER BEING takes over, and trivially finds out what exactly it is
39500 that PUP6
39600 wants to learn. The names of the three choices are so cryptic that
39700 their WHAT and HOW parts are printed out to the user, as well as
39800 their names. The user types back his choices, in our case all three.
39900 SOME-OF composes three new function calls, separated by a
40000 conditional:
40100 .B
40200
40300 (COND ( (IS-OF-TYPE F PROBABILITY:0-PART-OF-M-FEATURES)
40400 (PROBABILITY:0-CONTRADICTION F S-FEATURES))
40500 ((IS-OF-TYPE F PROBABILITY:1-PART-OF-M-FEATURES)
40600 (PROBABILITY:1-CONTRADICTION F S-FEATURES))
40700 ((IS-OF-TYPE F PROBABILITYε(0,1)-PART-OF-M-FEATURES)
40800 (PROBABILITYε(0,1)-CONTRADICTION F S-FEATURES)))
40900 .E
41000 TRICHOTOMY recognizes this as a split on a function's values
41100 being =0, =1, or between zero and one. It asks whether this
41200 particular function can only range over the interval [0,1].
41300 PROBABILITY answers affirmatively, so SOME-OF replaces the final
41400 IS-OF-TYPE test by the constant T. Later on, PUP6 must worry about
41500 the other two tests, and about the three contradiction predicates.
41600 These latter entities know (their ENCODABLE parts tell them) that
41700 they are immediately encodable. A probability=0 contradiction means
41800 that arg1 must not occur in the world arg2 yet it does. So the
41900 META-CODE of the PROBABILITY:0-CONTRADICTION BEING is defined as
42000 (MEMBER arg1 arg2). This corresponds to a MUSTNOT feature F which is
42100 present in the world (in
42200 the set S-FEATURES, features of the scene.) A
42300 probability=1 contradiction occurs when a feature arg1 must occur in
42400 the world arg2, yet it doesn't. So its definition is simply (NOT
42500 (MEMBER arg1 arg2)). It is impossible for a feature with probability
42600 in (0,1) to be in contradiction with any world (lacking negated
42700 features), so this third predicate is defined as the constant NIL.
42800 That is, if F is a MAY feature of the model M, then there can be no
42900 contradiction between F and ⊗4any⊗* features of the scene S.
43000 When a BEING is said to do or recognize or know something, as
43100 in the preceding and following paragraphs, what is actually meant is
43200 that one or more of its parts specificly encode that knowledge. All
43300 the parts of the hundred BEINGs in PUP6 were coded by hand, not by
43400 each other. They then encoded other BEINGs, interacting only via
43500 the dialogue.
43600 The IS-OF-TYPE BEING recognizes that it must work hard to
43700 replace each of the two case tests, unless M-FEATURES is organized
43800 permanently into three parts (thus permitting the complex function
43900 "IS-OF-TYPE" to be replaced by the simple one "MEMBER" in the above
44000 piece of code). The STRUCTURE-INDUCING BEING will take over, to probe
44100 the permissability and the difficulty of such a constraint. It finds
44200 nothing about this tripartite structuring which could damage any
44300 earlier synthesized code, and asks the user's blessing. Notes are
44400 added to the list of coding warnings, stating that any reference to
44500 the entire list of M-FEATURES must be replaced by either APPEND of
44600 the three new lists, or else by three separate statements. GET-NAME
44700 is indirectly called, and he has the user name the three new sets of
44800 features; say he responds by calling them MUSTNOT, MUST, and MAY. The
44900 system would at this point say to draw an arrow on its graph of code,
45000 from the function call (IMPOSS F S-FEATURES) to the new block of code:
45100 .BEGIN GROUP NOFILL FLUSH LEFT SELECT 8
45200
45300 (COND ( (MEMBER F MUSTNOT-PART-OF-M)
45400 (MEMBER F S-FEATURES))
45500 ( (MEMBER F MUST-PART-OF-M)
45600 (NOT (MEMBER F S-FEATURES))
45700 ( T (COMMENT THIS "T" REPLACES "MEMBER F MAY-PART-OF-M")
45800 NIL))
45900 .END
46000 This is now the META:CODE part of the new BEING called IMPOSS.
46100 Most of the other parts are taken from its generalization, namely
46200 CONTRADICTS. During the course of writing this piece, however, some
46300 of these parts would be slightly changed. For example, its reason
46400 for existing would be strengthened when the simple MEMBER function
46500 calls replaced the slow IS-OF-TYPE BEING calls.
46600 Most of this processing is inter-BEING activity; the user is
46700 not needed for -- or even informed of -- most of it.
46800 He has the feeling
46900 of merely directing, constraining, and watching the activities of a
47000 busy programmer. Unfortunately, "the user" is not generic; there
47100 was only one successful user. As was mentioned earlier, PUP6 insists
47200 on doing structured programming, so its traces are superficially
47300 similar to macro expansion. Despite this concentration on planning,
47400 no BEINGs system
47500 should halt so long as any details remain.
47600
47700 .B
47800
47900 .E
48000 .ONCE FLUSH LEFT
48100 ↓_3.10. Excerpt from the synthesized program itself running_↓
48200
48300 The PUP6 system interacts with the user, resulting in the creation of
48400 CF, the desired target program. Like PUP6, CF is nothing but a pool of
48500 BEINGs. So, as it runs, the user can interrupt it and ask it questions
48600 about what it's doing. The answers are typically just some parts of the
48700 BEING currently in control. An excerpt of CF running without interruption
48800 is given below, followed by the same piece with interruptions.
48900 The user's responses are in italics, and ⊗4α:⊗*
49000 signifies he has interrupted. Some liberty has been taken with the
49100 English; e.g., the user really types ⊗4WHAT?⊗*, not ⊗4What are you doing?⊗*
49200 .BEGIN NOFILL FLUSH LEFT
49300
49400 READY TO ACCEPT BRAND NEW SCENE: ⊗4(??? (A B) (BLOCK A) (GREEN B))⊗*
49500 NOT AN ARCH. NOT A TOWER. MIGHT BE A ROW. MY GUESS IS ROW.
49600 WHAT IS IT REALLY? ⊗4NOTHING⊗*
49700 TRANSFER FROM MAY TO MUST SETS OF ROW: (TOUCH A B)
49800 READY TO ACCEPT BRAND NEW SCENE:
49900
50000 Here is the same excerpt, but with the user interrupting occasionally:
50100 ⊗4α: What are you doing?⊗*
50200 PUP: Divide a domain into subdomains, using specialized version of PARTITION-DOMAIN
50300 ⊗4How?⊗*
50400 PUP: By building up the partition gradually, by adding one <scene, name> pair
50500 ⊗4OK. Go back to work⊗*
50600
50700 READY TO ACCEPT BRAND NEW SCENE:
50800 ⊗4α: Why?⊗*
50900 PUP: PUP wants SCENE, and PUP doesn't have it, and PUP won't compute it
51000 ⊗4OK. What you asked for is: (??? (A B) (BLOCK A) (GREEN B))⊗*
51100
51200 ⊗4α: What are you doing?⊗*
51300 PUP: Break a piece off (??? (A B) (BLOCK A) (GREEN B)) to examine
51400 ⊗4What is the meta-code of the current BEING?⊗*
51500 PUP: The meta-code of SOME-PART-OF (version 14) is (CAR arg1)
51600 ⊗4OK. Go back to work.⊗*
51700
51800 ⊗4α: What are you doing?⊗*
51900 PUP: Go through the known space of possible names of concepts, and perform
52000 TEST (version 18) until termination criteria are satisfied
52100 ⊗4Current name of possible concept?⊗*
52200 PUP: ARCH
52300 ⊗4OK. Go back to work.
52400
52500 α: What are you doing?⊗*
52600 PUP: COMPARE (version 19) the MUST set of ARCH with ((BLOCK A) (GREEN B))
52700 ⊗4OK. Go back to work.⊗*
52800
52900 NOT AN ARCH.
53000 ⊗4α: Why?⊗*
53100 PUP: The ARCH MUST relation (SUPPORTS A B) is contradicted in the SCENE
53200 ⊗4OK. Go back to work.⊗*
53300
53400 NOT A TOWER. MIGHT BE A ROW. MY GUESS IS ROW.
53500 WHAT IS IT REALLY? ⊗4α: Why are you asking?⊗*
53600 PUP: PUP cannot resolve it, and must at this time, so PUP asks USER
53700 ⊗4OK. What you wanted is: NOTHING
53800
53900 α: What are you doing?⊗*
54000 PUP: If NOTHING is a concept, then modify its relation sets
54100 ⊗4Why?⊗*
54200 PUP: Equivalent to updating the internal model of a concept, fundamental to CF
54300 ⊗4OK. Go back to work.
54400
54500 α: What are you doing?⊗*
54600 PUP: NOTHING is not a concept name; see if NOTHING is a sign to end program
54700 ⊗4OK. Go back to work.
54800
54900 α: What are you doing?
55000 PUP: Go through space of concept names, and make user aware of changes
55100 ⊗4What BEINGs might this affect?⊗*
55200 PUP: MESSAGE is possibly called; some version of TEST is possibly called
55300 ⊗4OK. Go back to work.⊗*
55400
55500 TRANSFER FROM MAY TO MUST SETS OF ROW: (TOUCH A B)
55600 READY TO ACCEPT BRAND NEW SCENE:
55700
55800
55900 .END
56000 .ONCE FLUSH LEFT
56100 ↓_3.10. BEINGs applied to volleyball and mathematics_↓
56200
56300 Since the only existing BEINGs system does automatic programming,
56400 it is excusable to concentrate on that task. Let us briefly digress,
56500 however, to indicate how a community of BEINGs could do other activities.
56600 Consider first the simulation of a volleyball game. A
56700 BEINGs-based system might create a specialized BEING for each player,
56800 perhaps with a complexity vector indicating his abilities, reflexes,
56900 etc. The BEING in control would be the player hitting the ball. A
57000 specialized Choose-from BEING would decide who goes next,
57100 occasionally interpreting a tie between BEINGs vying for control as a
57200 collision on the court.
57300 A less bizarre enterprise is the understanding of mathematics.
57400 Such a project is now in progress at SAIL, encompassing
57500 learning, discovery, and intuition, in several domains of mathematics.
57600 The driving force in such a system is aesthetic: the desire to complete each
57700 BEING. Since the task is no longer to write programs, a different set
57800 of BEING parts is called for. For efficiency, the guidance strategies
57900 might be implemented as rules. There would be a BEING for each entity,
58000 be it object, operator, analogy, etc. Its parts might be as follows:
58100
58200 .BEGIN NOFILL FLUSH LEFT
58300
58400 ⊗2DEFINITIONS⊗* Several equivalent definitions
58500 ⊗2NAME⊗* Significant English variations
58600 ⊗2IDEN⊗* Tests to see if this operator is the one being referred to
58700 ⊗2NOT-IDEN⊗* Tests to see if this operator is irrelevant
58800 ⊗2QUICK-IDEN/NOT-IDEN⊗* Cheap tests to see if operator is/isn't relevant
58900 ⊗2EXAMPLES⊗* Includes trivial, typical, and advanced cases; non-examples
59000 ⊗2BOUNDARY⊗* Examples which characterize the limits of this concept
59100 ⊗2FIX-BOUNDARY⊗* How can you change a concept so it crosses this border?
59200 ⊗2IMAGE⊗* Analogic interpretations: ties to simpler concepts, to real-world
59300 ⊗2WHY⊗* Why is this worth naming as a separate BEING?
59400 ⊗2USE⊗* Where is this used frequently, to advantage?
59500 ⊗2GENERALIZATIONS⊗* What is this a special case of? The new features
59600 ⊗2SPECIALIZATIONS⊗* What are special cases of this? What is simplified?
59700 ⊗2REPRESENTATION⊗* How should this be represented internally?
59800 ⊗2OTHERS⊗* What objects and operators are associated with this concept?
59900 ⊗2RESTRICTIONS⊗* What can't one do it do, or do to it? Temporary?
60000 ⊗2VALUE⊗* Aesthetic worth, efficiency ratings, complexity, certainty, ubiquity
60100 ⊗2EFFECTS⊗* The range for an operator, the results of using this concept
60200 ⊗2VIEWS⊗* How to view as an operator, property, set, object, relationship
60300 ⊗2HANDLES⊗* How to compute (get a handle on) this concept
60400
60500 .END
60600
00100
00200 .B
00300
00400 .E
00500 .ONCE CENTER
00600 ⊗5↓_4. Automatic Programming Task_↓
00700
00800 .NOFILL
00900
01000 ↓_4.1. Choice of target programs_↓
01100 .FILL
01200
01300 Once the first experimental task (automatic program synthesis from specific
01400 dialogue) was decided upon, considerable attention was spent on
01500 choosing an appropriate domain of target programs. The choice,
01600 inductive inference (II), merits discussion. The obvious
01700 difficulty appears to be the complexity of the programs involved:
01800 they are two orders of magnitude larger than any previously
01900 synthesized programs. But there are four big attractions:
02000 (i) A wide range of complexity exists, from a one-page sequence
02100 extrapolator to Meta-Dendral.
02200 (ii) Each increasingly
02300 sophisticated II program uses many of the concepts embodied in
02400 simpler II programs.
02500 (iii) If a super-human target program is
02600 ever written, it could itself contribute to the field of Automatic
02700 Programming! (This remark is humorous in the seventies, but may be
02800 commonplace someday.)
02900 (iv) Since people (especially those who write
03000 AI programs) are the inductive
03100 inference experts, our reliance on introspection is as
03200 valid -- and potentially as valuable -- as chemists' protocols were
03300 to Dendral.
03400 After studying many sample programs from the II domain,
03500 sequence extrapolation [Pilvar] seemed the most reasonable beginning
03600 task. It was quickly learned that this was too easy: humans have only
03700 a few techniques for extrapolating sequences, and a very limited
03800 capacity for composing them. Thus a rather rigid sequence
03900 extrapolator writer may be capable of generating a large class of
04000 super-human programs (see section 4.2 on
04100 Sequence-Extrapolator-Writer, in [Green]).
04200 The next candidates were grammatical inference and concept
04300 formation [Hempel].
04400 Determined not to choose too simple a task again, the
04500 latter program was finally decided upon. The particular target was a
04600 variant of [Winston]. To make the goal more tractable, certain parts
04700 of Winston's description were ignored, in particular the heuristic
04800 graph-matching section.
04900 After the target concept formation program was specified, it
05000 was trimmed and then rewritten as a structured program [Gadwa]. Next,
05100 a complete dialogue was simulated between the user and a human
05200 programmer (referred to as the system-player) playing the role of an
05300 "intelligent" automatic programming system (similar to, e.g.,
05400 [Balzer]). The system-player kept careful records as he
05500 programmed, and tried to create a bug-free structured program. The
05600 dialogue was then annotated: after each user response, comments
05700 were inserted which described the "states" the system-player had gone
05800 through before printing his next response. This included blind paths
05900 which were tried, use of outside world knowledge, and, in general,
06000 was meant to be the "intelligence" necessary to do the task. The
06100 fear was that a system could be built which synthesized the concept
06200 formation program, and yet was so unintelligent that nothing was
06300 learned from it. (see section 4.1 on PW1, for example, in [Green]).
06400 Hopefully, any system which
06500 (i) got the target program correctly,
06600 (ii) followed this
06700 initial dialogue, and, most importantly, (iii) went
06800 through the same line of reasoning that the comments indicated the
06900 system-player
07000 followed, would be far along the road toward intelligence. A
07100 corollary of this incremental annotated protocol approach is that the
07200 abilities of the user must coincide with those of the subject who
07300 participated in the protocol (see also [Woods]). The system would be
07400 far along the road toward automatic programming if it also (iv) was
07500 able to write CF from several dialogues, from several users, with
07600 little preparation. PUP6 was not designed to do this, and in the end
07700 it proved to be a serious deficit. Henceforth, ⊗4protocol⊗* will
07800 refer to this user-player / system-player simulated dialogue.
07900
08000 At this point, a rough
08100 initial set of BEINGs surfaced. Each one had not
08200 much more than a name and a vague description of what it must do.
08300 The dialogue was cycled through again, painstakingly, and the
08400 comments were replaced by a description of which BEINGs would call
08500 which other BEINGs, why, and the results of each such call. The
08600 constraints on each BEING thus grew, sometimes changed, and
08700 occasionally a new BEING or BEING part had to be added to the
08800 community. This process produced a long
08900 hand trace of the system execution.
09000 About eighty BEINGs were needed: a dozen domain-specific ones and
09100 the rest domain-independent programming knowledge. About thirty
09200 different BEING parts were called for.
09300 A result of this method of incremental specification of
09400 BEINGs was that each BEING had only those parts which were going to be
09500 used sometime during the ensuing dialogue. This seemed too specific,
09600 so some effort was spent filling out parts that weren't strictly
09700 necessary to write the concept formation program. Perhaps more
09800 careful attention to this activity would have made the system more
09900 tolerant of changes in the dialogue.
10000 During the course of massaging
10100 PUP6 into writing the other target programs, very few additional
10200 parts had to be added to existing BEINGs. Typically, either an old
10300 part had to be generalized or else a new BEING created. After writing
10400 CF, GI, and PL, there are now an even hundred BEINGs in PUP6.
10500 Let us briefly describe GI and PL, the second and
10600 third programs PUP6 synthesized. GI is a simple
10700 grammatical inference program. It builds up a set of plausible rules,
10800 rather than enumerating through the space of all grammars. Yet it
10900 lacks most of the "tricks" of the best g.i. programs. The user
11000 repeatedly specifies a string, labelled LEGAL, ILLEGAL, or ??. In the
11100 latter case, GI must print out its guess and accept the correct label
11200 from the user. In all three cases, it must update its set of
11300 plausible rules to be at least consistent with all positive and
11400 negative instances observed. In some versions of GI the user coaxes
11500 PUP6 to generate, a parse tree may be maintained and inspected; in
11600 other versions, just a list of the rules used during a parse is kept.
11700 PL is a data-retrieval-on-keys system.
11800 The user repeatedly types
11900 in a request to insert, delete, or inspect a record (e.g., INSERT:
12000 PASSENGER Jones FLIGHT 741 FROM SFO TO LAX CARRIER TWA LEAVE 7:15
12100 ARRIVE 8:00). Any unspecified fields are treated as don't cares;
12200 thus the request is matched against entries in the data base.
12300 The records are stored as property lists.
12400 The table below shows how each type of knowledge was used in
12500 writing the three target programs. All numbers refer to BEINGs.
12600
12700 .BEGIN
12800 .GROUP
12900 .NOFILL
13000 .SELECT 6
13100 .BREAK
13200
13300 .ONCE CENTER
13400 ⊗5U S E D I N S Y N T H E S I Z I N G⊗*
13500
13600 TYPE OF CF CF CF CF GI PL not Crea Crea Crea Total
13700 KNOWLEDGE GI GI PL only only only used -ted -ted -ted BEINGs
13800 PL only only ever in CF in GI in PL
13900
14000 Programming 39 7 5 5 3 1 14 52 27 21 174
14100 Domain-Specific 0 3 0 9 8 1 5 4 10 3 43
14200 Total BEINGs 39 10 5 14 11 2 19 56 37 24 217
14300 .END
14400
14500 Although BEINGs can theoretically handle
14600 user errors, PUP6 was set up expecting that the user would never err
14700 or forget. He could, after the program was finished, try it out and
14800 describe how he wished it changed. Occasionally, PUP6 actually make
14900 the right modifications. For example, after
15000 the synthesis of PL is finished, the user tries
15100 out the program and finds that he is unable to delete more than one
15200 reservation with any single command. He complains about this, and
15300 PUP6 modifies the program so that he may specify a pattern, and all
15400 reservations matching that pattern will be purged.
15500 While assumptions of absolute
15600 user reliability are reasonable for little programs,
15700 they break down when the tasks reach the size of tens of pages and
15800 hours.
15900 Good measures of concentration of intelligence are not yet
16000 available. The only alternative is to present ephemeral numerical
16100 efficiency data, which now follows. PUP6 is 200 pages long when
16200 PRETTY-PRINTED, and has synthesized three programs, 7, 20, and 30
16300 pages long (1, 4, and 6 pages, when coded by hand.)
16400 PUP6 takes 60 cpu minutes to produce CF (compiled INTERLISP
16500 code, run on a PDP-10 TENEX system). During this time,
16600 300K characters get typed out to the user, who reponds with about
16700 4K himself.
16800 The mean wait time (between
16900 the user's input and the next machine output) is several seconds. The
17000 longest delay between successive user inputs is 1 CPU minute.
17100
17200 When the user sticks close to the original protocol, PUP6 asks the
17300 right questions at the right times because of its genesis from that
17400 annotated protocol. The second and third programs were attempted
17500 only to test its flexibility, and the results were mixed.
17600 First the bright side. The increment to PUP6 to enable it to
17700 write the GI and the PL programs is encouragingly small. Almost all
17800 the programming-knowledge BEINGs are called in writing more than one
17900 of the programs. It is now clear that very domain-specific BEINGs
18000 were in control at the early, high levels. Later, more general BEINGs
18100 took over, followed by pure programming BEINGs. The middle category
18200 would include inductive-inference-specific
18300 BEINGs like PARTITION-A-DOMAIN.
18400 Regrettably, incrementing the system with new domain-specific
18500 BEINGs is not feasable for the average user. To add a BEING, he must
18600 specify all of its relevant parts. Each is inputted either as LISP
18700 code, as an English sentence PUP6 is capable of interpreting, an
18800 English sentence PUP6 is just supposed to remember as a canned
18900 response, or by pointing to a part of an existing BEING and saying
19000 "like that one, except ...", where the ellipsis must be
19100 very simple. The user need not be familiar with the precise existing
19200 set of BEINGs, nor with their interactions; but the set of allowable
19300 assertion templates, the mechanisms by which each part is used, the
19400 special data types involved (VECTOR, TUPLE), and the precise actions
19500 of each new part ⊗4must⊗* be known in order to safely inject a new
19600 BEING.
19700 This may be a result of the small amount of knowledge in PUP6. One would
19800 hope that a BEINGs system which was expert in a domain could assist the
19900 user in adding new domain-specific BEINGs, in the same way that the
20000 BEINGs which make up the target code are synthesized, through dialogue.
00100
00200 .B
00300
00400 .E
00500 .ONCE CENTER
00600 ⊗5↓_5. Conclusions_↓
00700
00800 While the exact number of BEING parts is unimportant, its magnitude
00900 (a few tens) may be relevant to the feasability of BEINGs systems.
01000 If it were ~1, any uniform representation (e.g., predicate calculus)
01100 would be as easy to use; if it were ~1000, the task of creating
01200 the original pool of BEINGs would be too formidable and ill-structured.
01300
01400 The location of relevant information is
01500 effected by giving each BEING the responsibility
01600 for recognizing when it is talked about. This would beat the exponential
01700 nature of this problem, if it the computer used had as many processors as
01800 there are BEINGs. On a uniprocessor, this scheme proved to be very slow.
01900
02000 The only startling result was the "inteligent" style of the generated code.
02100 As shown in section 3.9, the synthesized programs can be queried as they
02200 run.
02300
02400 The set of questions the user was expected to want to ask is similar to the
02500 set of questions one BEING can ask another: the BEING parts. When a "nice"
02600 user interrupts, his question is answered by a simple retrieval. Real
02700 users were seldom nice; PUP6 often misunderstood what was asked.
02800
02900 To modify PUP6 to synthesize programs in a domain other than inductive
03000 inference, it would be necessary to add a few domain-independent BEINGs
03100 and several domain-specific ones, and to generalize some parts of some
03200 existing BEINGs.
03300 The second and third target programs were attempted to determine just
03400 how difficult this task is. Although only a few new BEINGs were
03500 needed, they could not have been added by an untrained user. The
03600 dialogues were not well-suited to their tasks (since the communication
03700 subsystem was generated from a concept-formation protocol) and were
03800 quite brittle.
03900
04000 There was no sophistocated user model, and this caused the entire
04100 system to suffer. The BEINGs didn't have any indication of how
04200 verbose they were in toto. Swamping the user with unneeded detail
04300 also increases his chances of erring. It finally became necessary to
04400 insert a FORGETFUL-USER demon, to prevent, e.g., delayed anaphora.
04500 Related to this is the problem of keeping the human user oriented.
04600 PUP6 used the obvious scheme (cursors pointing into a graph of the
04700 newly created BEINGs), which was insufficient. The user often wished
04800 to refer to a section not by name or by pointing, but rather by some
04900 brief, descriptive, meaningful (to him only!) phrase.
05000
05100 The design of BEINGs is toward large programs; low-level, efficient
05200 code could be turned out as a ⊗4simulation⊗* effort, much as playing
05300 volleyball. The analogue is using a high-level
05400 language to turn out efficient machine code.
05500
05600 Some of the BEING parts proved unnecessary. The CO-REQUISITES part was
05700 never used: the only simultaneous activity needed was demon
05800 activiation. The AFFECTS part was of occasional interest to the user,
05900 but not really used by any BEINGs. The EFFECTS part originally had a
06000 twin, describing what would happen if the BEING were ⊗4not⊗*
06100 executed then. In each case, this turned out to be merely a
06200 restatement of the frame problem.
06300
06400 The idea of a universal set of BEING parts (for a given system) fared
06500 marginally well. Each part was used in about a third of all BEINGs,
06600 and each BEING used about a third of all possible parts. If these
06700 fractions had been much lower, the advantage of uniformity would
06800 have vanished. The future of BEINGs (e.g., in the current
06900 math-understanding project) seems to be as a tool, a single aspect of
07000 the system.